Protein Fold Recognition with K-Local Hyperplane Distance Nearest Neighbor Algorithm
Abstract
This paper deals with protein structure analysis, which helps in understanding protein function and, in turn, evolutionary relationships, since a protein's function follows from its form (shape). One of the basic approaches to structure analysis is protein fold recognition (a protein fold is a 3-D pattern), which is applied when structurally similar proteins show no significant sequence similarity. Fold recognition does not rely on sequence similarity and can be achieved with relevant features extracted from protein sequences. Given such numerical features, an existing machine learning technique can then be applied to learn and classify the proteins they represent. In this paper, we experiment with the K-Local Hyperplane Distance Nearest Neighbor algorithm (HKNN) [12] applied to protein fold recognition. The goal is to compare it with other methods tested on a real-world dataset [3]. Two tasks are considered: 1) classification into four structural classes of proteins and 2) classification into the 27 most populated protein folds composing these structural classes. Preliminary results demonstrate that HKNN can compete with other methods in both speed and accuracy, which encourages its further exploration.
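The abstract names HKNN without describing it, so the following is a minimal sketch of the idea as it is commonly formulated, not the authors' implementation. For each class, the K training points nearest to the query define a local hyperplane, and the query is assigned to the class whose hyperplane is closest. The function name `hknn_predict`, the parameters `k` and `lam` (a ridge penalty on the hyperplane coefficients), and the toy data in the usage note are assumptions made for illustration.

```python
import numpy as np


def hknn_predict(X_train, y_train, x, k=3, lam=1.0):
    """Return the label whose local hyperplane lies closest to x.

    Illustrative sketch only: k and lam are assumed hyperparameters.
    """
    best_label, best_dist = None, np.inf
    for label in np.unique(y_train):
        Xc = X_train[y_train == label]
        # K nearest neighbours of x within this class (Euclidean distance).
        nearest = np.argsort(np.linalg.norm(Xc - x, axis=1))[:k]
        N = Xc[nearest]
        centroid = N.mean(axis=0)
        # Columns of V are the neighbours relative to their centroid;
        # they span the local hyperplane through the neighbours.
        V = (N - centroid).T
        r = x - centroid
        # Ridge-regularised least squares for the hyperplane coefficients.
        A = V.T @ V + lam * np.eye(V.shape[1])
        alpha = np.linalg.solve(A, V.T @ r)
        # Distance from x to its regularised projection on the hyperplane.
        dist = np.linalg.norm(r - V @ alpha)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

As a usage example under the same assumptions, with class 0 placed at (0,0), (1,0), (2,0) and class 1 at (0,5), (1,5), (2,5), calling `hknn_predict(X, y, np.array([10.0, 1.0]))` picks class 0, because the query lies nearer to the line through the class 0 points. The penalty `lam` keeps the linear system solvable when the neighbours are collinear or fewer than the feature dimension, and it limits how far the hyperplane extends beyond the neighbours.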
Similar resources
K-Local Hyperplane Distance Nearest-Neighbor Algorithm and Protein Fold Recognition
Two proteins may be structurally similar but not have significant sequence similarity. Protein fold recognition is an approach usually applied in this case. It does not rely on sequence similarity and can be achieved with relevant features extracted from protein sequences. In this paper, we experiment with the K-local hyperplane distance nearest-neighbor algorithm [8] applied to the protein fo...
Diagnosis of Breast Cancer Tissues Using 785 nm Miniature Raman Spectrometer and Pattern Regression
For achieving the development of a portable, low-cost and in vivo cancer diagnosis instrument, a laser 785 nm miniature Raman spectrometer was used to acquire the Raman spectra for breast cancer detection in this paper. However, because of the low spectral signal-to-noise ratio, it is difficult to achieve high discrimination accuracy by using the miniature Raman spectrometer. Therefore, a patte...
Classification by ALH-Fast Algorithm
The adaptive local hyperplane (ALH) algorithm is a very recently proposed classifier, which has been shown to perform better than many other benchmarking classifiers including support vector machine (SVM), K-nearest neighbor (KNN), linear discriminant analysis (LDA), and K-local hyperplane distance nearest neighbor (HKNN) algorithms. Although the ALH algorithm is well formulated and despite the...
Feature Normalization and Selection for Protein Fold Recognition
Protein is an amino acid sequence. To determine protein function, which is important in understanding evolutionary relationships, fold recognition is one of the promising techniques to apply, especially when protein sequence identity is below 50%, so that no reliable classification can be obtained from a sequence comparison. Fold recognition is the analysis of proteins based on structure rather...
An affinity-based new local distance function and similarity measure for kNN algorithm
In this paper, we propose a modified version of the k-nearest neighbor (kNN) algorithm. We first introduce a new affinity function for distance measure between a test point and a training point which is an approach based on local learning. A new similarity function using this affinity function is proposed next for the classification of the test patterns. The widely used convention of k, i.e., k...